Dataset statistics
| Number of variables | 16 |
|---|---|
| Number of observations | 430 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 53.9 KiB |
| Average record size in memory | 128.3 B |
Variable types
| Categorical | 7 |
|---|---|
| Numeric | 9 |
Alerts
| Alert | Type |
|---|---|
| model has a high cardinality: 119 distinct values | High cardinality |
| ROM is highly correlated with screen_size and 3 other fields | High correlation |
| RAM is highly correlated with screen_size and 5 other fields | High correlation |
| display_size is highly correlated with brand and 9 other fields | High correlation |
| num_rear_camera is highly correlated with processor and 4 other fields | High correlation |
| battery_capacity is highly correlated with brand and 8 other fields | High correlation |
| ratings is highly correlated with brand and 6 other fields | High correlation |
| num_of_ratings is highly correlated with sales | High correlation |
| sales_price is highly correlated with brand and 8 other fields | High correlation |
| sales is highly correlated with num_of_ratings | High correlation |
| processor is highly correlated with brand and 6 other fields | High correlation |
| brand is highly correlated with processor and 5 other fields | High correlation |
| base_color is highly correlated with num_front_camera and 1 other field | High correlation |
| screen_size is highly correlated with brand and 6 other fields | High correlation |
| num_front_camera is highly correlated with base_color and 1 other field | High correlation |
| discount_percent is highly correlated with display_size and 1 other field | High correlation |
Reproduction
| Analysis started | 2022-09-12 15:48:40.910986 |
|---|---|
| Analysis finished | 2022-09-12 15:49:09.750303 |
| Duration | 28.84 seconds |
| Software version | pandas-profiling v3.3.0 |
| Download configuration | config.json |
brand (Categorical)
| Distinct | 5 |
|---|---|
| Distinct (%) | 1.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.5 KiB |
Length
| Max length | 7 |
|---|---|
| Median length | 6 |
| Mean length | 5.886046512 |
| Min length | 4 |
Characters and Unicode
| Total characters | 2531 |
|---|---|
| Distinct characters | 17 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
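As a rough illustration of how such character statistics can be derived, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. The `brand_counts` dictionary is rebuilt by hand from the value counts reported for this variable and only stands in for the real column; the profiling tool's own implementation may differ.

```python
import unicodedata
from collections import Counter

# Assumed stand-in for the column: value -> number of rows, copied from the report's counts.
brand_counts = {"Realme": 138, "Samsung": 119, "Xiaomi": 61, "Apple": 56, "Poco": 56}

# Count every character occurrence, weighted by how many rows contain the value.
char_counts = Counter()
for brand, rows in brand_counts.items():
    for ch in brand:
        char_counts[ch] += rows

# Group the character counts by Unicode general category (e.g. "Ll", "Lu").
category_counts = Counter()
for ch, n in char_counts.items():
    category_counts[unicodedata.category(ch)] += n

total_characters = sum(char_counts.values())
distinct_characters = len(char_counts)

print(total_characters, distinct_characters, dict(category_counts))
```

`Ll` and `Lu` are the Unicode short names for Lowercase Letter and Uppercase Letter. The standard library does not expose the script or block properties, so those two rows are not reproduced here.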
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Apple |
|---|---|
| 2nd row | Apple |
| 3rd row | Apple |
| 4th row | Apple |
| 5th row | Apple |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| Realme | 138 | 32.1% |
| Samsung | 119 | 27.7% |
| Xiaomi | 61 | 14.2% |
| Apple | 56 | 13.0% |
| Poco | 56 | 13.0% |
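The Frequency (%) column is each count divided by the number of observations. A minimal sketch of that arithmetic, assuming the counts in the table above and one-decimal rounding (the variable names are illustrative only):

```python
# Assumed input: value -> count, copied from the Common Values table.
brand_counts = {"Realme": 138, "Samsung": 119, "Xiaomi": 61, "Apple": 56, "Poco": 56}

total = sum(brand_counts.values())  # number of observations

# Share of rows per value, as a percentage rounded to one decimal place.
frequencies = {brand: round(100 * count / total, 1) for brand, count in brand_counts.items()}

# "Distinct (%)" is the number of distinct values relative to the number of rows.
distinct_pct = round(100 * len(brand_counts) / total, 1)

for brand, pct in frequencies.items():
    print(f"{brand}: {pct}%")
print(f"Distinct: {distinct_pct}%")
```

The same division reproduces the percentages that are printed elsewhere in the report, for example 1.2% distinct values for this variable.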
Length
Histogram of lengths of the category
Category Frequency Plot
| Value | Count | Frequency (%) |
| realme | 138 | |
| samsung | 119 | |
| xiaomi | 61 | |
| apple | 56 | |
| poco | 56 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 332 | |
| a | 318 | |
| m | 318 | |
| l | 194 | 7.7% |
| o | 173 | 6.8% |
| R | 138 | 5.5% |
| i | 122 | 4.8% |
| g | 119 | 4.7% |
| n | 119 | 4.7% |
| u | 119 | 4.7% |
| Other values (7) | 579 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 2101 | |
| Uppercase Letter | 430 | 17.0% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 332 | |
| a | 318 | |
| m | 318 | |
| l | 194 | |
| o | 173 | |
| i | 122 | 5.8% |
| g | 119 | 5.7% |
| n | 119 | 5.7% |
| u | 119 | 5.7% |
| s | 119 | 5.7% |
| Other values (2) | 168 |
Uppercase Letter
| Value | Count | Frequency (%) |
| R | 138 | |
| S | 119 | |
| X | 61 | |
| A | 56 | |
| P | 56 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 2531 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 332 | |
| a | 318 | |
| m | 318 | |
| l | 194 | 7.7% |
| o | 173 | 6.8% |
| R | 138 | 5.5% |
| i | 122 | 4.8% |
| g | 119 | 4.7% |
| n | 119 | 4.7% |
| u | 119 | 4.7% |
| Other values (7) | 579 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2531 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 332 | |
| a | 318 | |
| m | 318 | |
| l | 194 | 7.7% |
| o | 173 | 6.8% |
| R | 138 | 5.5% |
| i | 122 | 4.8% |
| g | 119 | 4.7% |
| n | 119 | 4.7% |
| u | 119 | 4.7% |
| Other values (7) | 579 |
model (Categorical)
| Distinct | 119 |
|---|---|
| Distinct (%) | 27.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.5 KiB |
Length
| Max length | 23 |
|---|---|
| Median length | 15 |
| Mean length | 8.651162791 |
| Min length | 1 |
Characters and Unicode
| Total characters | 3720 |
|---|---|
| Distinct characters | 48 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 28 |
|---|---|
| Unique (%) | 6.5% |
Sample
| 1st row | iPhone SE |
|---|---|
| 2nd row | iPhone 12 Mini |
| 3rd row | iPhone SE |
| 4th row | iPhone XR |
| 5th row | iPhone 12 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| iPhone XR | 18 | 4.2% |
| iPhone 12 | 17 | 4.0% |
| iPhone 12 Mini | 16 | 3.7% |
| GT Master Edition | 9 | 2.1% |
| X3 | 9 | 2.1% |
| M3 | 9 | 2.1% |
| M2 Pro | 9 | 2.1% |
| Galaxy A21s | 7 | 1.6% |
| Narzo 30 | 6 | 1.4% |
| Galaxy F62 | 6 | 1.4% |
| Other values (109) | 324 | 75.3% |
Length
Histogram of lengths of the category
| Value | Count | Frequency (%) |
| galaxy | 117 | 12.6% |
| pro | 71 | 7.7% |
| iphone | 56 | 6.0% |
| 5g | 48 | 5.2% |
| redmi | 41 | 4.4% |
| 12 | 33 | 3.6% |
| narzo | 31 | 3.3% |
| note | 26 | 2.8% |
| x3 | 21 | 2.3% |
| mi | 20 | 2.2% |
| Other values (99) | 464 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 498 | 13.4% |
| a | 288 | 7.7% |
| o | 212 | 5.7% |
| i | 187 | 5.0% |
| G | 185 | 5.0% |
| 2 | 170 | 4.6% |
| e | 150 | 4.0% |
| 1 | 138 | 3.7% |
| l | 133 | 3.6% |
| P | 129 | 3.5% |
| Other values (38) | 1630 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1727 | |
| Uppercase Letter | 819 | |
| Decimal Number | 674 | 18.1% |
| Space Separator | 498 | 13.4% |
| Dash Punctuation | 2 | 0.1% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| G | 185 | |
| P | 129 | |
| M | 95 | |
| A | 69 | 8.4% |
| R | 61 | 7.4% |
| N | 58 | 7.1% |
| X | 55 | 6.7% |
| F | 46 | 5.6% |
| C | 34 | 4.2% |
| T | 22 | 2.7% |
| Other values (9) | 65 | 7.9% |
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 288 | |
| o | 212 | |
| i | 187 | |
| e | 150 | |
| l | 133 | |
| x | 123 | |
| r | 121 | |
| y | 117 | |
| n | 83 | 4.8% |
| d | 60 | 3.5% |
| Other values (7) | 253 |
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 170 | |
| 1 | 138 | |
| 3 | 97 | |
| 0 | 86 | |
| 5 | 80 | |
| 7 | 38 | 5.6% |
| 6 | 25 | 3.7% |
| 8 | 24 | 3.6% |
| 4 | 8 | 1.2% |
| 9 | 8 | 1.2% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 498 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 2 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 2546 | |
| Common | 1174 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 288 | 11.3% |
| o | 212 | 8.3% |
| i | 187 | 7.3% |
| G | 185 | 7.3% |
| e | 150 | 5.9% |
| l | 133 | 5.2% |
| P | 129 | 5.1% |
| x | 123 | 4.8% |
| r | 121 | 4.8% |
| y | 117 | 4.6% |
| Other values (26) | 901 |
Common
| Value | Count | Frequency (%) |
| (space) | 498 | 42.4% |
| 2 | 170 | 14.5% |
| 1 | 138 | 11.8% |
| 3 | 97 | 8.3% |
| 0 | 86 | 7.3% |
| 5 | 80 | 6.8% |
| 7 | 38 | 3.2% |
| 6 | 25 | 2.1% |
| 8 | 24 | 2.0% |
| 4 | 8 | 0.7% |
| Other values (2) | 10 | 0.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3720 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 498 | 13.4% |
| a | 288 | 7.7% |
| o | 212 | 5.7% |
| i | 187 | 5.0% |
| G | 185 | 5.0% |
| 2 | 170 | 4.6% |
| e | 150 | 4.0% |
| 1 | 138 | 3.7% |
| l | 133 | 3.6% |
| P | 129 | 3.5% |
| Other values (38) | 1630 |
base_color (Categorical)
| Distinct | 12 |
|---|---|
| Distinct (%) | 2.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.5 KiB |
Length
| Max length | 6 |
|---|---|
| Median length | 5 |
| Mean length | 4.746511628 |
| Min length | 3 |
Characters and Unicode
| Total characters | 2041 |
|---|---|
| Distinct characters | 27 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Black |
|---|---|
| 2nd row | Red |
| 3rd row | Red |
| 4th row | Others |
| 5th row | Red |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| Blue | 117 | 27.2% |
| Black | 112 | 26.0% |
| White | 44 | 10.2% |
| Silver | 32 | 7.4% |
| Others | 28 | 6.5% |
| Green | 24 | 5.6% |
| Red | 21 | 4.9% |
| Gray | 20 | 4.7% |
| Yellow | 11 | 2.6% |
| Gold | 11 | 2.6% |
| Other values (2) | 10 | 2.3% |
Length
Histogram of lengths of the category
| Value | Count | Frequency (%) |
| blue | 117 | |
| black | 112 | |
| white | 44 | 10.2% |
| silver | 32 | 7.4% |
| others | 28 | 6.5% |
| green | 24 | 5.6% |
| red | 21 | 4.9% |
| gray | 20 | 4.7% |
| yellow | 11 | 2.6% |
| gold | 11 | 2.6% |
| Other values (2) | 10 | 2.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 311 | |
| l | 299 | |
| B | 234 | |
| a | 132 | 6.5% |
| u | 122 | 6.0% |
| r | 114 | 5.6% |
| k | 112 | 5.5% |
| c | 112 | 5.5% |
| i | 76 | 3.7% |
| h | 72 | 3.5% |
| Other values (17) | 457 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1611 | |
| Uppercase Letter | 430 | 21.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 311 | |
| l | 299 | |
| a | 132 | |
| u | 122 | 7.6% |
| r | 114 | 7.1% |
| k | 112 | 7.0% |
| c | 112 | 7.0% |
| i | 76 | 4.7% |
| h | 72 | 4.5% |
| t | 72 | 4.5% |
| Other values (9) | 189 |
Uppercase Letter
| Value | Count | Frequency (%) |
| B | 234 | |
| G | 55 | 12.8% |
| W | 44 | 10.2% |
| S | 32 | 7.4% |
| O | 28 | 6.5% |
| R | 21 | 4.9% |
| Y | 11 | 2.6% |
| P | 5 | 1.2% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 2041 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 311 | |
| l | 299 | |
| B | 234 | |
| a | 132 | 6.5% |
| u | 122 | 6.0% |
| r | 114 | 5.6% |
| k | 112 | 5.5% |
| c | 112 | 5.5% |
| i | 76 | 3.7% |
| h | 72 | 3.5% |
| Other values (17) | 457 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2041 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 311 | |
| l | 299 | |
| B | 234 | |
| a | 132 | 6.5% |
| u | 122 | 6.0% |
| r | 114 | 5.6% |
| k | 112 | 5.5% |
| c | 112 | 5.5% |
| i | 76 | 3.7% |
| h | 72 | 3.5% |
| Other values (17) | 457 |
processor (Categorical)
| Distinct | 7 |
|---|---|
| Distinct (%) | 1.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.5 KiB |
Length
| Max length | 8 |
|---|---|
| Median length | 8 |
| Mean length | 7.418604651 |
| Min length | 3 |
Characters and Unicode
| Total characters | 3190 |
|---|---|
| Distinct characters | 25 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Water |
|---|---|
| 2nd row | Ceramic |
| 3rd row | Water |
| 4th row | iOS |
| 5th row | Ceramic |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| Qualcomm | 168 | 39.1% |
| MediaTek | 144 | 33.5% |
| Exynos | 53 | 12.3% |
| Ceramic | 33 | 7.7% |
| iOS | 12 | 2.8% |
| Water | 11 | 2.6% |
| Others | 9 | 2.1% |
Length
Histogram of lengths of the category
Category Frequency Plot
| Value | Count | Frequency (%) |
| qualcomm | 168 | |
| mediatek | 144 | |
| exynos | 53 | 12.3% |
| ceramic | 33 | 7.7% |
| ios | 12 | 2.8% |
| water | 11 | 2.6% |
| others | 9 | 2.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| m | 369 | |
| a | 356 | |
| e | 341 | 10.7% |
| o | 221 | 6.9% |
| c | 201 | 6.3% |
| i | 189 | 5.9% |
| Q | 168 | 5.3% |
| l | 168 | 5.3% |
| u | 168 | 5.3% |
| T | 144 | 4.5% |
| Other values (15) | 865 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 2604 | |
| Uppercase Letter | 586 | 18.4% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| m | 369 | |
| a | 356 | |
| e | 341 | |
| o | 221 | |
| c | 201 | |
| i | 189 | |
| l | 168 | |
| u | 168 | |
| k | 144 | 5.5% |
| d | 144 | 5.5% |
| Other values (7) | 303 |
Uppercase Letter
| Value | Count | Frequency (%) |
| Q | 168 | |
| T | 144 | |
| M | 144 | |
| E | 53 | 9.0% |
| C | 33 | 5.6% |
| O | 21 | 3.6% |
| S | 12 | 2.0% |
| W | 11 | 1.9% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 3190 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| m | 369 | |
| a | 356 | |
| e | 341 | 10.7% |
| o | 221 | 6.9% |
| c | 201 | 6.3% |
| i | 189 | 5.9% |
| Q | 168 | 5.3% |
| l | 168 | 5.3% |
| u | 168 | 5.3% |
| T | 144 | 4.5% |
| Other values (15) | 865 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3190 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| m | 369 | |
| a | 356 | |
| e | 341 | 10.7% |
| o | 221 | 6.9% |
| c | 201 | 6.3% |
| i | 189 | 5.9% |
| Q | 168 | 5.3% |
| l | 168 | 5.3% |
| u | 168 | 5.3% |
| T | 144 | 4.5% |
| Other values (15) | 865 |
screen_size (Categorical)
| Distinct | 5 |
|---|---|
| Distinct (%) | 1.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.5 KiB |
Length
| Max length | 10 |
|---|---|
| Median length | 5 |
| Mean length | 5.43255814 |
| Min length | 5 |
Characters and Unicode
| Total characters | 2336 |
|---|---|
| Distinct characters | 15 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Very Small |
|---|---|
| 2nd row | Small |
| 3rd row | Very Small |
| 4th row | Medium |
| 5th row | Medium |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| Large | 242 | 56.3% |
| Medium | 146 | 34.0% |
| Small | 34 | 7.9% |
| Very Small | 4 | 0.9% |
| Very Large | 4 | 0.9% |
Length
Histogram of lengths of the category
Category Frequency Plot
| Value | Count | Frequency (%) |
| large | 246 | |
| medium | 146 | |
| small | 38 | 8.7% |
| very | 8 | 1.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 400 | |
| a | 284 | |
| r | 254 | |
| L | 246 | |
| g | 246 | |
| m | 184 | |
| M | 146 | 6.2% |
| d | 146 | 6.2% |
| i | 146 | 6.2% |
| u | 146 | 6.2% |
| Other values (5) | 138 | 5.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1890 | |
| Uppercase Letter | 438 | 18.8% |
| Space Separator | 8 | 0.3% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 400 | |
| a | 284 | |
| r | 254 | |
| g | 246 | |
| m | 184 | |
| d | 146 | 7.7% |
| i | 146 | 7.7% |
| u | 146 | 7.7% |
| l | 76 | 4.0% |
| y | 8 | 0.4% |
Uppercase Letter
| Value | Count | Frequency (%) |
| L | 246 | |
| M | 146 | |
| S | 38 | 8.7% |
| V | 8 | 1.8% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 8 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 2328 | |
| Common | 8 | 0.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 400 | |
| a | 284 | |
| r | 254 | |
| L | 246 | |
| g | 246 | |
| m | 184 | |
| M | 146 | 6.3% |
| d | 146 | 6.3% |
| i | 146 | 6.3% |
| u | 146 | 6.3% |
| Other values (4) | 130 | 5.6% |
Common
| Value | Count | Frequency (%) |
| (space) | 8 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2336 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 400 | |
| a | 284 | |
| r | 254 | |
| L | 246 | |
| g | 246 | |
| m | 184 | |
| M | 146 | 6.2% |
| d | 146 | 6.2% |
| i | 146 | 6.2% |
| u | 146 | 6.2% |
| Other values (5) | 138 | 5.9% |
ROM (Numeric)
| Distinct | 7 |
|---|---|
| Distinct (%) | 1.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 105.7488372 |
| Minimum | 8 |
|---|---|
| Maximum | 512 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.5 KiB |
Quantile statistics
| Minimum | 8 |
|---|---|
| 5-th percentile | 32 |
| Q1 | 64 |
| median | 128 |
| Q3 | 128 |
| 95-th percentile | 256 |
| Maximum | 512 |
| Range | 504 |
| Interquartile range (IQR) | 64 |
Descriptive statistics
| Standard deviation | 63.16406421 |
|---|---|
| Coefficient of variation (CV) | 0.5973026832 |
| Kurtosis | 4.281647946 |
| Mean | 105.7488372 |
| Median Absolute Deviation (MAD) | 64 |
| Skewness | 1.495005215 |
| Sum | 45472 |
| Variance | 3989.699008 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=7)
| Value | Count | Frequency (%) |
| 128 | 192 | |
| 64 | 138 | |
| 32 | 54 | 12.6% |
| 256 | 38 | 8.8% |
| 16 | 5 | 1.2% |
| 8 | 2 | 0.5% |
| 512 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 8 | 2 | 0.5% |
| 16 | 5 | 1.2% |
| 32 | 54 | 12.6% |
| 64 | 138 | |
| 128 | 192 | |
| 256 | 38 | 8.8% |
| 512 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 512 | 1 | 0.2% |
| 256 | 38 | 8.8% |
| 128 | 192 | |
| 64 | 138 | |
| 32 | 54 | 12.6% |
| 16 | 5 | 1.2% |
| 8 | 2 | 0.5% |
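The summary figures for the numeric variable just above can be cross-checked from its frequency table alone. The sketch below is one way to do that with Python's standard `statistics` module. It assumes the value counts listed above, a sample standard deviation (n - 1 denominator), and linearly interpolated quartiles; these assumptions happen to reproduce the reported numbers, but the profiling tool's own code path may differ.

```python
import statistics

# Assumed reconstruction of the column: value -> count, copied from the frequency table.
value_counts = {128: 192, 64: 138, 32: 54, 256: 38, 16: 5, 8: 2, 512: 1}
data = sorted(v for v, n in value_counts.items() for _ in range(n))

mean = statistics.mean(data)
std = statistics.stdev(data)   # sample standard deviation (n - 1 denominator)
cv = std / mean                # coefficient of variation

# Quartiles with linear interpolation over the full data range.
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Median absolute deviation: median distance of each value from the median.
mad = statistics.median(abs(x - median) for x in data)

print(f"sum={sum(data)} mean={mean:.4f} std={std:.4f} cv={cv:.4f}")
print(f"Q1={q1} median={median} Q3={q3} IQR={iqr} MAD={mad}")
```

Working from counts rather than raw rows is enough here because every statistic shown depends only on the multiset of values, not on their order.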
RAM (Numeric)
| Distinct | 7 |
|---|---|
| Distinct (%) | 1.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.320930233 |
| Minimum | 1 |
|---|---|
| Maximum | 12 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.5 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 4 |
| median | 4 |
| Q3 | 6 |
| 95-th percentile | 8 |
| Maximum | 12 |
| Range | 11 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 2.182635235 |
|---|---|
| Coefficient of variation (CV) | 0.4101980555 |
| Kurtosis | 0.503140175 |
| Mean | 5.320930233 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 0.7468856991 |
| Sum | 2288 |
| Variance | 4.763896569 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=7)
| Value | Count | Frequency (%) |
| 4 | 133 | |
| 6 | 114 | |
| 8 | 88 | |
| 3 | 60 | |
| 2 | 21 | 4.9% |
| 12 | 12 | 2.8% |
| 1 | 2 | 0.5% |
| Value | Count | Frequency (%) |
| 1 | 2 | 0.5% |
| 2 | 21 | 4.9% |
| 3 | 60 | |
| 4 | 133 | |
| 6 | 114 | |
| 8 | 88 | |
| 12 | 12 | 2.8% |
| Value | Count | Frequency (%) |
| 12 | 12 | 2.8% |
| 8 | 88 | |
| 6 | 114 | |
| 4 | 133 | |
| 3 | 60 | |
| 2 | 21 | 4.9% |
| 1 | 2 | 0.5% |
display_size (Numeric)
| Distinct | 17 |
|---|---|
| Distinct (%) | 4.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.369767442 |
| Minimum | 4.7 |
|---|---|
| Maximum | 7.6 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.5 KiB |
Quantile statistics
| Minimum | 4.7 |
|---|---|
| 5-th percentile | 5.445 |
| Q1 | 6.3 |
| median | 6.5 |
| Q3 | 6.5 |
| 95-th percentile | 6.7 |
| Maximum | 7.6 |
| Range | 2.9 |
| Interquartile range (IQR) | 0.2 |
Descriptive statistics
| Standard deviation | 0.3695488863 |
|---|---|
| Coefficient of variation (CV) | 0.05801607196 |
| Kurtosis | 5.152325006 |
| Mean | 6.369767442 |
| Median Absolute Deviation (MAD) | 0.1 |
| Skewness | -1.553611897 |
| Sum | 2739 |
| Variance | 0.1365663794 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=17)
| Value | Count | Frequency (%) |
| 6.5 | 164 | |
| 6.4 | 64 | 14.9% |
| 6.7 | 62 | 14.4% |
| 6.1 | 43 | 10.0% |
| 6.3 | 22 | 5.1% |
| 5.4 | 16 | 3.7% |
| 6.6 | 14 | 3.3% |
| 6.2 | 12 | 2.8% |
| 5.8 | 6 | 1.4% |
| 5.5 | 6 | 1.4% |
| Other values (7) | 21 | 4.9% |
| Value | Count | Frequency (%) |
| 4.7 | 4 | 0.9% |
| 5.2 | 2 | 0.5% |
| 5.4 | 16 | 3.7% |
| 5.5 | 6 | 1.4% |
| 5.6 | 1 | 0.2% |
| 5.7 | 3 | 0.7% |
| 5.8 | 6 | 1.4% |
| 6 | 5 | 1.2% |
| 6.1 | 43 | |
| 6.2 | 12 | 2.8% |
| Value | Count | Frequency (%) |
| 7.6 | 4 | 0.9% |
| 6.9 | 2 | 0.5% |
| 6.7 | 62 | 14.4% |
| 6.6 | 14 | 3.3% |
| 6.5 | 164 | |
| 6.4 | 64 | 14.9% |
| 6.3 | 22 | 5.1% |
| 6.2 | 12 | 2.8% |
| 6.1 | 43 | 10.0% |
| 6 | 5 | 1.2% |
num_rear_camera (Categorical)
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.5 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 430 |
|---|---|
| Distinct characters | 4 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 2 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 2 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 3 | 157 | 36.5% |
| 4 | 136 | 31.6% |
| 2 | 97 | 22.6% |
| 1 | 40 | 9.3% |
Length
Histogram of lengths of the category
Category Frequency Plot
| Value | Count | Frequency (%) |
| 3 | 157 | |
| 4 | 136 | |
| 2 | 97 | |
| 1 | 40 | 9.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| 3 | 157 | |
| 4 | 136 | |
| 2 | 97 | |
| 1 | 40 | 9.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 430 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 3 | 157 | |
| 4 | 136 | |
| 2 | 97 | |
| 1 | 40 | 9.3% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 430 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 3 | 157 | |
| 4 | 136 | |
| 2 | 97 | |
| 1 | 40 | 9.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 430 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 3 | 157 | |
| 4 | 136 | |
| 2 | 97 | |
| 1 | 40 | 9.3% |
num_front_camera (Categorical)
| Distinct | 3 |
|---|---|
| Distinct (%) | 0.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.5 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 430 |
|---|---|
| Distinct characters | 3 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 413 | 96.0% |
| 2 | 15 | 3.5% |
| 3 | 2 | 0.5% |
Length
Histogram of lengths of the category
Category Frequency Plot
| Value | Count | Frequency (%) |
| 1 | 413 | |
| 2 | 15 | 3.5% |
| 3 | 2 | 0.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 413 | |
| 2 | 15 | 3.5% |
| 3 | 2 | 0.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 430 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 413 | |
| 2 | 15 | 3.5% |
| 3 | 2 | 0.5% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 430 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 413 | |
| 2 | 15 | 3.5% |
| 3 | 2 | 0.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 430 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 413 | |
| 2 | 15 | 3.5% |
| 3 | 2 | 0.5% |
battery_capacity (Numeric)
| Distinct | 30 |
|---|---|
| Distinct (%) | 7.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4529.397674 |
| Minimum | 1800 |
|---|---|
| Maximum | 7000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.5 KiB |
Quantile statistics
| Minimum | 1800 |
|---|---|
| 5-th percentile | 2815 |
| Q1 | 4000 |
| median | 4500 |
| Q3 | 5000 |
| 95-th percentile | 6000 |
| Maximum | 7000 |
| Range | 5200 |
| Interquartile range (IQR) | 1000 |
Descriptive statistics
| Standard deviation | 986.9072515 |
|---|---|
| Coefficient of variation (CV) | 0.2178892918 |
| Kurtosis | 0.05662835901 |
| Mean | 4529.397674 |
| Median Absolute Deviation (MAD) | 500 |
| Skewness | -0.2838950755 |
| Sum | 1947641 |
| Variance | 973985.9231 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=30)
| Value | Count | Frequency (%) |
| 5000 | 129 | |
| 6000 | 54 | |
| 4000 | 49 | 11.4% |
| 4500 | 42 | 9.8% |
| 2815 | 33 | 7.7% |
| 2942 | 18 | 4.2% |
| 4300 | 16 | 3.7% |
| 4200 | 9 | 2.1% |
| 3300 | 8 | 1.9% |
| 7000 | 6 | 1.4% |
| Other values (20) | 66 |
| Value | Count | Frequency (%) |
| 1800 | 5 | 1.2% |
| 2600 | 2 | 0.5% |
| 2815 | 33 | |
| 2942 | 18 | 4.2% |
| 3000 | 2 | 0.5% |
| 3080 | 2 | 0.5% |
| 3300 | 8 | 1.9% |
| 3400 | 2 | 0.5% |
| 3700 | 1 | 0.2% |
| 4000 | 49 |
| Value | Count | Frequency (%) |
| 7000 | 6 | 1.4% |
| 6000 | 54 | |
| 5160 | 6 | 1.4% |
| 5065 | 6 | 1.4% |
| 5020 | 4 | 0.9% |
| 5000 | 129 | |
| 4820 | 2 | 0.5% |
| 4800 | 1 | 0.2% |
| 4780 | 1 | 0.2% |
| 4520 | 3 | 0.7% |
ratings (Numeric)
| Distinct | 10 |
|---|---|
| Distinct (%) | 2.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.339302326 |
| Minimum | 3 |
|---|---|
| Maximum | 4.6 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.5 KiB |
Quantile statistics
| Minimum | 3 |
|---|---|
| 5-th percentile | 4.1 |
| Q1 | 4.3 |
| median | 4.3 |
| Q3 | 4.4 |
| 95-th percentile | 4.6 |
| Maximum | 4.6 |
| Range | 1.6 |
| Interquartile range (IQR) | 0.1 |
Descriptive statistics
| Standard deviation | 0.1514944259 |
|---|---|
| Coefficient of variation (CV) | 0.03491216202 |
| Kurtosis | 14.04302502 |
| Mean | 4.339302326 |
| Median Absolute Deviation (MAD) | 0.1 |
| Skewness | -1.73239576 |
| Sum | 1865.9 |
| Variance | 0.02295056107 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=10)
| Value | Count | Frequency (%) |
| 4.3 | 181 | |
| 4.4 | 79 | |
| 4.5 | 55 | 12.8% |
| 4.2 | 55 | 12.8% |
| 4.6 | 37 | 8.6% |
| 4 | 10 | 2.3% |
| 4.1 | 8 | 1.9% |
| 3.9 | 3 | 0.7% |
| 3 | 1 | 0.2% |
| 3.8 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 3 | 1 | 0.2% |
| 3.8 | 1 | 0.2% |
| 3.9 | 3 | 0.7% |
| 4 | 10 | 2.3% |
| 4.1 | 8 | 1.9% |
| 4.2 | 55 | 12.8% |
| 4.3 | 181 | |
| 4.4 | 79 | |
| 4.5 | 55 | 12.8% |
| 4.6 | 37 | 8.6% |
| Value | Count | Frequency (%) |
| 4.6 | 37 | 8.6% |
| 4.5 | 55 | 12.8% |
| 4.4 | 79 | |
| 4.3 | 181 | |
| 4.2 | 55 | 12.8% |
| 4.1 | 8 | 1.9% |
| 4 | 10 | 2.3% |
| 3.9 | 3 | 0.7% |
| 3.8 | 1 | 0.2% |
| 3 | 1 | 0.2% |
num_of_ratings (Numeric)
| Distinct | 175 |
|---|---|
| Distinct (%) | 40.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 23567.94419 |
| Minimum | 4 |
|---|---|
| Maximum | 642373 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.5 KiB |
Quantile statistics
| Minimum | 4 |
|---|---|
| 5-th percentile | 23 |
| Q1 | 745 |
| median | 5197.5 |
| Q3 | 21089.25 |
| 95-th percentile | 123155.45 |
| Maximum | 642373 |
| Range | 642369 |
| Interquartile range (IQR) | 20344.25 |
Descriptive statistics
| Standard deviation | 56096.27778 |
|---|---|
| Coefficient of variation (CV) | 2.380193934 |
| Kurtosis | 47.93020113 |
| Mean | 23567.94419 |
| Median Absolute Deviation (MAD) | 4964.5 |
| Skewness | 5.850072852 |
| Sum | 10134216 |
| Variance | 3146792381 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 5366 | 18 | 4.2% |
| 745 | 17 | 4.0% |
| 244 | 16 | 3.7% |
| 7 | 6 | 1.4% |
| 61812 | 6 | 1.4% |
| 105 | 6 | 1.4% |
| 15016 | 6 | 1.4% |
| 39881 | 6 | 1.4% |
| 33364 | 6 | 1.4% |
| 26 | 5 | 1.2% |
| Other values (165) | 338 |
| Value | Count | Frequency (%) |
| 4 | 3 | |
| 6 | 2 | 0.5% |
| 7 | 6 | |
| 8 | 4 | |
| 10 | 1 | 0.2% |
| 16 | 2 | 0.5% |
| 19 | 3 | |
| 23 | 3 | |
| 26 | 5 | |
| 35 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 642373 | 1 | 0.2% |
| 470905 | 1 | 0.2% |
| 357064 | 1 | 0.2% |
| 267028 | 1 | 0.2% |
| 226996 | 3 | |
| 223672 | 2 | |
| 155242 | 2 | |
| 141177 | 2 | |
| 129661 | 2 | |
| 125016 | 3 |
sales_price (Numeric)
| Distinct | 141 |
|---|---|
| Distinct (%) | 32.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 25433.23488 |
| Minimum | 5742 |
|---|---|
| Maximum | 157999 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.5 KiB |
Quantile statistics
| Minimum | 5742 |
|---|---|
| 5-th percentile | 8287.45 |
| Q1 | 11999 |
| median | 16989.5 |
| Q3 | 28999 |
| 95-th percentile | 72149 |
| Maximum | 157999 |
| Range | 152257 |
| Interquartile range (IQR) | 17000 |
Descriptive statistics
| Standard deviation | 22471.92659 |
|---|---|
| Coefficient of variation (CV) | 0.8835654092 |
| Kurtosis | 8.980504019 |
| Mean | 25433.23488 |
| Median Absolute Deviation (MAD) | 6490.5 |
| Skewness | 2.595227695 |
| Sum | 10936291 |
| Variance | 504987484.6 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 14999 | 22 | 5.1% |
| 15999 | 14 | 3.3% |
| 8999 | 12 | 2.8% |
| 9999 | 12 | 2.8% |
| 16999 | 11 | 2.6% |
| 11499 | 11 | 2.6% |
| 10499 | 10 | 2.3% |
| 42999 | 9 | 2.1% |
| 47999 | 9 | 2.1% |
| 21999 | 9 | 2.1% |
| Other values (131) | 311 |
| Value | Count | Frequency (%) |
| 5742 | 1 | 0.2% |
| 6499 | 2 | 0.5% |
| 6890 | 1 | 0.2% |
| 6999 | 1 | 0.2% |
| 7299 | 2 | 0.5% |
| 7499 | 5 | |
| 7990 | 1 | 0.2% |
| 7999 | 4 | |
| 8083 | 1 | 0.2% |
| 8190 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 157999 | 1 | 0.2% |
| 149999 | 3 | |
| 91999 | 2 | 0.5% |
| 88999 | 2 | 0.5% |
| 84999 | 2 | 0.5% |
| 79149 | 6 | |
| 77999 | 2 | 0.5% |
| 73999 | 1 | 0.2% |
| 72149 | 6 | |
| 71999 | 1 | 0.2% |
discount_percent (Numeric)
| Distinct | 33 |
|---|---|
| Distinct (%) | 7.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.108 |
| Minimum | 0.01 |
|---|---|
| Maximum | 0.44 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.5 KiB |
Quantile statistics
| Minimum | 0.01 |
|---|---|
| 5-th percentile | 0.02 |
| Q1 | 0.06 |
| median | 0.09 |
| Q3 | 0.16 |
| 95-th percentile | 0.24 |
| Maximum | 0.44 |
| Range | 0.43 |
| Interquartile range (IQR) | 0.1 |
Descriptive statistics
| Standard deviation | 0.07343201667 |
|---|---|
| Coefficient of variation (CV) | 0.6799260803 |
| Kurtosis | 2.269150088 |
| Mean | 0.108 |
| Median Absolute Deviation (MAD) | 0.04 |
| Skewness | 1.301589636 |
| Sum | 46.44 |
| Variance | 0.005392261072 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=33)
| Value | Count | Frequency (%) |
| 0.09 | 44 | 10.2% |
| 0.08 | 38 | 8.8% |
| 0.04 | 32 | 7.4% |
| 0.07 | 29 | 6.7% |
| 0.1 | 27 | 6.3% |
| 0.06 | 25 | 5.8% |
| 0.02 | 23 | 5.3% |
| 0.05 | 23 | 5.3% |
| 0.2 | 22 | 5.1% |
| 0.16 | 21 | 4.9% |
| Other values (23) | 146 |
| Value | Count | Frequency (%) |
|---|---|---|
| 0.01 | 11 | 2.6% |
| 0.02 | 23 | 5.3% |
| 0.03 | 17 | 4.0% |
| 0.04 | 32 | 7.4% |
| 0.05 | 23 | 5.3% |
| 0.06 | 25 | 5.8% |
| 0.07 | 29 | 6.7% |
| 0.08 | 38 | 8.8% |
| 0.09 | 44 | 10.2% |
| 0.1 | 27 | 6.3% |
| Value | Count | Frequency (%) |
|---|---|---|
| 0.44 | 1 | 0.2% |
| 0.43 | 1 | 0.2% |
| 0.39 | 2 | 0.5% |
| 0.36 | 1 | 0.2% |
| 0.31 | 2 | 0.5% |
| 0.3 | 4 | 0.9% |
| 0.29 | 3 | 0.7% |
| 0.28 | 3 | 0.7% |
| 0.25 | 3 | 0.7% |
| 0.24 | 3 | 0.7% |
| Distinct | 216 |
|---|---|
| Distinct (%) | 50.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 29.75232558 |
| Minimum | 0 |
|---|---|
| Maximum | 550.19 |
| Zeros | 3 |
| Zeros (%) | 0.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 3.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0.079 |
| Q1 | 1.64 |
| median | 9.655 |
| Q3 | 29.7175 |
| 95-th percentile | 130.29 |
| Maximum | 550.19 |
| Range | 550.19 |
| Interquartile range (IQR) | 28.0775 |
Descriptive statistics
| Standard deviation | 58.39958786 |
|---|---|
| Coefficient of variation (CV) | 1.962857918 |
| Kurtosis | 31.58655786 |
| Mean | 29.75232558 |
| Median Absolute Deviation (MAD) | 9.035 |
| Skewness | 4.789041038 |
| Sum | 12793.5 |
| Variance | 3410.511862 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
|---|---|---|
| 23.07 | 9 | 2.1% |
| 25.76 | 9 | 2.1% |
| 5.15 | 6 | 1.4% |
| 5.9 | 6 | 1.4% |
| 1.76 | 6 | 1.4% |
| 13.81 | 5 | 1.2% |
| 10 | 5 | 1.2% |
| 4.78 | 5 | 1.2% |
| 1.52 | 5 | 1.2% |
| 1.39 | 5 | 1.2% |
| Other values (206) | 369 | 85.8% |
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 3 | 0.7% |
| 0.01 | 3 | 0.7% |
| 0.02 | 2 | 0.5% |
| 0.03 | 3 | 0.7% |
| 0.05 | 3 | 0.7% |
| 0.06 | 4 | 0.9% |
| 0.07 | 4 | 0.9% |
| 0.09 | 2 | 0.5% |
| 0.1 | 2 | 0.5% |
| 0.11 | 1 | 0.2% |
| Value | Count | Frequency (%) |
|---|---|---|
| 550.19 | 1 | 0.2% |
| 493.98 | 1 | 0.2% |
| 427.22 | 1 | 0.2% |
| 392.73 | 1 | 0.2% |
| 231.79 | 1 | 0.2% |
| 204 | 1 | 0.2% |
| 182.12 | 1 | 0.2% |
| 175.04 | 1 | 0.2% |
| 174.9 | 1 | 0.2% |
| 173.75 | 1 | 0.2% |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
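The calculation described above can be sketched in plain Python. This is an illustrative example only, not the code the report generator runs: the helper names `average_ranks`, `pearson_r`, and `spearman_rho` are hypothetical, ties are assumed to receive the average of their rank positions, and the sample values are made up rather than taken from the profiled dataset.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Made-up data: y grows monotonically but not linearly with x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(f"Spearman rho = {rho:.3f}, Pearson r = {r:.3f}")
```

Because the relationship is monotonic, ρ reaches its maximum, while r stays below 1 since the relationship is not linear.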
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
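As a minimal sketch of that definition, the following plain-Python example computes r directly and checks the invariance property on made-up data. The function name `pearson_r` and the sample values are assumptions for illustration, and the rescaling uses a positive scale factor.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)


# Made-up, roughly linear data (not taken from the profiled dataset).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(x, y)
# Shift and rescale y (positive scale): r should be unchanged.
r_rescaled = pearson_r(x, [3.0 * b + 7.0 for b in y])
print(f"r = {r:.4f}, after rescaling y = {r_rescaled:.4f}")
```

The two printed values should agree up to floating-point error, which illustrates why the slope of a linear relationship does not affect r.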
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
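The pair-counting formula above can be illustrated with a short plain-Python sketch. It implements exactly the formula as stated in the text, without any tie correction, so it may differ from what a statistics library reports when the data contain ties. The function name `kendall_tau` and the sample values are hypothetical.

```python
from itertools import combinations


def kendall_tau(x, y):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables order the pair the same way.
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


# Made-up data: one of the six pairs is ordered differently in x and y.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau(x, y)
print(f"tau = {tau:.4f}")
```

Here five pairs are concordant and one is discordant, so τ = (5 - 1) / 6.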
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation for Phik is available.
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
First rows
| | brand | model | base_color | processor | screen_size | ROM | RAM | display_size | num_rear_camera | num_front_camera | battery_capacity | ratings | num_of_ratings | sales_price | discount_percent | sales |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Apple | iPhone SE | Black | Water | Very Small | 64 | 2 | 4.7 | 1 | 1 | 1800 | 4.5 | 38645 | 32999 | 0.17 | 127.52 |
| 1 | Apple | iPhone 12 Mini | Red | Ceramic | Small | 64 | 4 | 5.4 | 2 | 1 | 2815 | 4.5 | 244 | 57149 | 0.04 | 1.39 |
| 2 | Apple | iPhone SE | Red | Water | Very Small | 64 | 2 | 4.7 | 1 | 1 | 1800 | 4.5 | 38645 | 32999 | 0.17 | 127.52 |
| 3 | Apple | iPhone XR | Others | iOS | Medium | 64 | 3 | 6.1 | 1 | 1 | 2942 | 4.6 | 5366 | 42999 | 0.10 | 23.07 |
| 4 | Apple | iPhone 12 | Red | Ceramic | Medium | 128 | 4 | 6.1 | 2 | 1 | 2815 | 4.6 | 745 | 69149 | 0.02 | 5.15 |
| 5 | Apple | iPhone 12 | Blue | Ceramic | Medium | 64 | 4 | 6.1 | 2 | 1 | 2815 | 4.6 | 745 | 64149 | 0.02 | 4.78 |
| 6 | Apple | iPhone 12 | White | Ceramic | Medium | 128 | 4 | 6.1 | 2 | 1 | 2815 | 4.6 | 745 | 69149 | 0.02 | 5.15 |
| 7 | Apple | iPhone 12 | Green | Ceramic | Medium | 64 | 4 | 6.1 | 2 | 1 | 2815 | 4.6 | 745 | 64149 | 0.02 | 4.78 |
| 8 | Apple | iPhone 12 | Blue | Ceramic | Medium | 128 | 4 | 6.1 | 2 | 1 | 2815 | 4.6 | 745 | 69149 | 0.02 | 5.15 |
| 9 | Apple | iPhone 12 | Black | Ceramic | Medium | 128 | 4 | 6.1 | 2 | 1 | 2815 | 4.6 | 745 | 69149 | 0.02 | 5.15 |
Last rows
| | brand | model | base_color | processor | screen_size | ROM | RAM | display_size | num_rear_camera | num_front_camera | battery_capacity | ratings | num_of_ratings | sales_price | discount_percent | sales |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 420 | Xiaomi | Mi 10i | Black | Qualcomm | Large | 128 | 8 | 6.7 | 4 | 1 | 4820 | 4.3 | 663 | 24215 | 0.06 | 1.61 |
| 421 | Xiaomi | Redmi Note 9 Pro | Blue | Qualcomm | Large | 128 | 4 | 6.7 | 2 | 1 | 5020 | 4.4 | 6106 | 14199 | 0.03 | 8.67 |
| 422 | Xiaomi | Redmi Note 9 Pro | Blue | Qualcomm | Large | 128 | 6 | 6.7 | 4 | 1 | 5020 | 4.3 | 434 | 14999 | 0.06 | 0.65 |
| 423 | Xiaomi | Redmi Y3 | Red | Qualcomm | Medium | 32 | 3 | 6.3 | 2 | 1 | 4000 | 4.4 | 6844 | 8252 | 0.31 | 5.65 |
| 424 | Xiaomi | Redmi 5 | Blue | Qualcomm | Small | 16 | 2 | 5.7 | 1 | 1 | 3300 | 4.3 | 4267 | 6890 | 0.18 | 2.94 |
| 425 | Xiaomi | Redmi 6 Pro | Black | Qualcomm | Small | 32 | 3 | 5.8 | 2 | 1 | 4000 | 4.3 | 1870 | 7999 | 0.30 | 1.50 |
| 426 | Xiaomi | Redmi 6 Pro | Red | Qualcomm | Small | 64 | 4 | 5.8 | 2 | 1 | 4000 | 4.3 | 1783 | 9699 | 0.28 | 1.73 |
| 427 | Xiaomi | Mi 11 Lite | Others | Qualcomm | Large | 128 | 6 | 6.5 | 3 | 1 | 4250 | 4.2 | 1554 | 21999 | 0.12 | 3.42 |
| 428 | Xiaomi | Redmi 8A Dual | Blue | Qualcomm | Medium | 32 | 3 | 6.2 | 2 | 1 | 5000 | 4.2 | 8161 | 8299 | 0.07 | 6.77 |
| 429 | Xiaomi | Redmi 6 Pro | Blue | Qualcomm | Small | 32 | 3 | 5.8 | 2 | 1 | 4000 | 4.3 | 1870 | 8190 | 0.36 | 1.53 |